
refactor(gradio): decompose training interface into focused modules#774

Merged
ChuxiJ merged 2 commits into ace-step:main from 1larity:issue-fd-training-interface-decompose
Mar 6, 2026

Conversation

@1larity
Contributor

@1larity 1larity commented Mar 5, 2026

Summary
This PR decomposes acestep/ui/gradio/interfaces/training.py into focused training interface builders while preserving wiring contracts and UI behavior.

It also restores/extends i18n coverage for newly introduced training/LoKr strings across en/ja/zh/he, with UTF-8-safe handling to avoid mojibake and preserve emoji labels.
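To illustrate the failure mode that UTF-8-safe handling guards against, here is a minimal sketch. It is not taken from the PR, and the sample entry is hypothetical. It shows an emoji label turning into mojibake when its UTF-8 bytes are decoded with a legacy codec, then a JSON round trip that keeps the label intact by passing an explicit encoding and `ensure_ascii=False`.

```python
import json
import os
import tempfile

# Hypothetical locale entry; the real keys and labels live in the i18n JSON files.
label = {"title": "🎛️ Training"}

# Mojibake: the UTF-8 bytes are decoded with a legacy single-byte codec.
garbled = label["title"].encode("utf-8").decode("latin-1")

# UTF-8-safe round trip: explicit encoding on write and read, and
# ensure_ascii=False so the emoji is stored as-is instead of as \u escapes.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(label, fh, ensure_ascii=False, indent=2)
    with open(path, "r", encoding="utf-8") as fh:
        restored = json.load(fh)
```

Relying on the platform default encoding instead of `encoding="utf-8"` is a common source of this kind of corruption on systems whose default is not UTF-8.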

What Changed

  • Kept acestep/ui/gradio/interfaces/training.py as a thin facade/orchestrator.
  • Extracted focused interface modules:
    • training_contract_ast_utils.py
    • training_dataset_builder_tab.py
    • training_dataset_tab_scan_settings.py
    • training_dataset_tab_label_preview.py
    • training_dataset_tab_save_preprocess.py
    • training_lora_tab.py
    • training_lora_tab_dataset.py
    • training_lora_tab_run_export.py
    • training_lokr_tab.py
    • training_lokr_tab_dataset.py
    • training_lokr_tab_run_export.py
  • Added decomposition contract coverage:
    • training_decomposition_contract_test.py
  • Updated i18n files:
    • acestep/ui/gradio/i18n/en.json
    • acestep/ui/gradio/i18n/ja.json
    • acestep/ui/gradio/i18n/zh.json
    • acestep/ui/gradio/i18n/he.json
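The facade arrangement described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the builder names, the component keys, and the duplicate-key guard are all hypothetical, and plain strings stand in for Gradio components so the sketch runs without Gradio installed.

```python
from typing import Callable, Dict, List

Components = Dict[str, object]


def build_dataset_tab() -> Components:
    """Hypothetical stand-in for the dataset builder tab."""
    return {"scan_btn": "Button", "audio_files_table": "Dataframe"}


def build_lora_tab() -> Components:
    """Hypothetical stand-in for the LoRA training tab."""
    return {"lora_start_btn": "Button"}


def build_lokr_tab() -> Components:
    """Hypothetical stand-in for the LoKr training tab."""
    return {"lokr_start_btn": "Button"}


def create_training_section(builders: List[Callable[[], Components]]) -> Components:
    """Merge builder outputs into one dict, rejecting duplicate keys."""
    section: Components = {}
    for build in builders:
        part = build()
        clash = section.keys() & part.keys()
        if clash:
            # Assumption: a silent overwrite would hide a wiring bug, so fail loudly.
            raise KeyError(f"duplicate component keys: {sorted(clash)}")
        section.update(part)
    return section


training_section = create_training_section(
    [build_dataset_tab, build_lora_tab, build_lokr_tab]
)
```

The event wiring only sees the merged dictionary, which is why the set of returned keys is the contract the facade has to preserve.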

Behavioral Parity

  • Preserved wiring key contracts expected by training event wiring.
  • Restored emoji section/tab/button labels for training/LoKr UI headers and actions.
  • Preserved app.title emoji/title text integrity (🎛️ ... 💡) in all touched locale files.
  • Added missing LoKr/training i18n keys so decomposed UI text resolves via t(...) rather than drifting into hardcoded strings.
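As a rough sketch of what resolving UI text through a translation helper looks like, the snippet below looks up a dotted key and falls back to a default language. The signature and fallback behavior of the project's actual `t(...)` helper are not shown in this PR, so the function, the `CATALOGS` structure, and the sample key are all assumptions.

```python
from typing import Dict

# Hypothetical in-memory catalogs; the real strings live in the locale JSON files.
CATALOGS: Dict[str, dict] = {
    "en": {"training": {"lokr_factor": "Factor"}},
    "ja": {"training": {}},
}


def t(key: str, lang: str = "en", default_lang: str = "en") -> str:
    """Resolve a dotted key, falling back to the default language, then to the key."""
    for code in (lang, default_lang):
        node = CATALOGS.get(code, {})
        for part in key.split("."):
            if not isinstance(node, dict) or part not in node:
                node = None
                break
            node = node[part]
        if isinstance(node, str):
            return node
    # Returning the key makes a missing translation visible in the UI.
    return key
```

With a helper of this shape, a key missing from one locale shows the default-language text instead of failing, while a key missing everywhere shows up as its raw name.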

Validation

Executed:

  • uv run python -m unittest acestep.ui.gradio.interfaces.training_decomposition_contract_test
    • Result: Ran 6 tests ... OK
  • uv run python -m unittest acestep.ui.gradio.events.wiring.decomposition_contract_training_test
    • Result: Ran 4 tests ... OK
  • JSON parse check for touched locales (en/ja/zh/he)
    • Result: i18n-json-ok
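The exact command behind the JSON parse check is not shown. A comparable check, extended to report keys that a locale lacks relative to English, might look like the sketch below; the function names and the choice of `en` as the reference are assumptions.

```python
import json
from pathlib import Path
from typing import Dict, Set


def flatten_keys(node: object, prefix: str = "") -> Set[str]:
    """Return dotted paths for every leaf value in a nested dict."""
    if not isinstance(node, dict):
        return {prefix}
    keys: Set[str] = set()
    for name, child in node.items():
        path = f"{prefix}.{name}" if prefix else name
        keys |= flatten_keys(child, path)
    return keys


def missing_keys(locale_dir: Path, reference: str = "en") -> Dict[str, Set[str]]:
    """Parse each *.json locale as UTF-8 and list keys absent relative to the reference."""
    catalogs = {
        p.stem: json.loads(p.read_text(encoding="utf-8"))
        for p in sorted(locale_dir.glob("*.json"))
    }
    expected = flatten_keys(catalogs[reference])
    return {
        lang: expected - flatten_keys(data)
        for lang, data in catalogs.items()
        if lang != reference
    }
```

Calling `missing_keys(Path("acestep/ui/gradio/i18n"))` from the repository root would raise on any file that fails to parse and otherwise return one set of missing keys per locale.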

Scope / Out of Scope

In scope:

  • training interface decomposition
  • contract tests for decomposition integrity
  • i18n additions required by new training/LoKr UI strings
  • emoji/mojibake safety in touched locale files

Out of scope:

  • non-training UI refactors
  • API/runtime behavior changes
  • unrelated translation rewrites outside required new keys

Reviewer Focus

  1. Facade-to-helper delegation and key parity in training.py.
  2. Event wiring compatibility of returned training component keys.
  3. UI text parity for emoji-decorated headers/buttons.
  4. Locale key completeness for en/ja/zh/he and absence of mojibake in touched strings.

CodeRabbit Scope Guard

Please treat findings as in-scope only when introduced by this FD + i18n patch set.

  • Pre-existing issues outside touched training decomposition paths should be marked out-of-scope/non-blocking unless they indicate behavior drift introduced by this PR.

Manual UI Validation

  • Full UI test pass completed after this patch set.
  • Training tabs and actions validated end-to-end with no regressions observed.
  • Emoji-decorated labels/titles render correctly after i18n updates.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added LoKr training support with dedicated tab and comprehensive configuration options for tensor decomposition, linear layers, and training parameters.
    • Introduced modularized dataset builder interface with organized sections for scanning, labeling, preprocessing, and saving.
    • Expanded internationalization support with new UI strings for Hebrew, Japanese, and Chinese languages.
  • Refactor

    • Restructured training interface to separate concerns into focused tabs and sections for improved usability.

@coderabbitai
Contributor

coderabbitai bot commented Mar 5, 2026

📝 Walkthrough

Walkthrough

This PR decomposes the monolithic training.py interface into modular, composable tab builders for dataset management, LoRA training, and LoKr training, introducing new i18n strings across four languages (English, Hebrew, Japanese, Chinese) and adding AST-based contract tests to validate the refactored architecture.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Internationalization Updates: acestep/ui/gradio/i18n/en.json, he.json, ja.json, zh.json | Added 30+ new LoKr-related UI strings (labels, descriptions, parameter info) across four languages; minor indentation adjustments in language metadata. Updated status/export messages with emoji-based content in English. |
| Training UI Refactoring (Facade): acestep/ui/gradio/interfaces/training.py | Major restructuring: removed outer gr.Tab wrapper, delegated tab composition to separate builders, introduced epoch slider defaults computation, added state management. Reduced from 818 to 51 net lines by composing dataset builder and LoRA training tabs into unified training_section. |
| Training Dataset Builder Decomposition: acestep/ui/gradio/interfaces/training_dataset_builder_tab.py, training_dataset_tab_scan_settings.py, training_dataset_tab_label_preview.py, training_dataset_tab_save_preprocess.py | Four new modules implementing composable dataset UI: scanner/settings, label/preview editor, and save/preprocess controls. Each exports a single builder function returning component dictionaries for wiring. |
| Training LoRA Tab Decomposition: acestep/ui/gradio/interfaces/training_lora_tab.py, training_lora_tab_dataset.py, training_lora_tab_run_export.py | Three new modules for LoRA training UI: facade tab builder and two control section builders for dataset/adapter settings and run/export operations. |
| Training LoKr Tab Decomposition: acestep/ui/gradio/interfaces/training_lokr_tab.py, training_lokr_tab_dataset.py, training_lokr_tab_run_export.py | Three new modules for LoKr training UI: facade tab builder and two control section builders for dataset/adapter settings and run/export operations with LoKr-specific parameters. |
| Testing & Utilities: acestep/ui/gradio/interfaces/training_contract_ast_utils.py, training_decomposition_contract_test.py | New AST utility module providing contract analysis functions (load_module, call_name, collect_return_dict_keys, collect_training_section_keys_used_by_wiring). New test class with six contract validation tests ensuring decomposition integrity, i18n coverage, and UI marker preservation. |
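To give a sense of how an AST-based contract check can work without importing Gradio, here is a minimal sketch. It borrows the name collect_return_dict_keys from the summary above, but the signature and implementation are assumptions and may differ from the module in this PR.

```python
import ast
from typing import Set


def collect_return_dict_keys(source: str, function_name: str) -> Set[str]:
    """Collect string keys from dict literals returned by the named function."""
    tree = ast.parse(source)
    keys: Set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            for inner in ast.walk(node):
                if isinstance(inner, ast.Return) and isinstance(inner.value, ast.Dict):
                    for key in inner.value.keys:
                        # Skip non-literal keys and ** spreads (which appear as None).
                        if isinstance(key, ast.Constant) and isinstance(key.value, str):
                            keys.add(key.value)
    return keys


# Hypothetical builder source used only to exercise the helper.
SAMPLE = '''
def build_controls():
    return {"edit_caption": 1, "edit_genre": 2}
'''

found = collect_return_dict_keys(SAMPLE, "build_controls")
```

A contract test can then compare the keys collected from each builder against the keys the wiring code reads, all from source text alone.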

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • ChuxiJ

Poem

🐰 A training UI split with grace,
Modules now have their own space,
LoKr, LoRA, dataset care—
Composed with helpers, light as air!
Tests ensure the parts align,
This decomposition's fine! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly identifies the main change as a decomposition/refactoring of the training interface into modular components, which is accurately reflected in the raw summary showing the extraction of multiple focused builder modules from a monolithic training.py file. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@1larity 1larity marked this pull request as ready for review March 5, 2026 22:59
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (2)
acestep/ui/gradio/interfaces/training.py (1)

38-59: Apply resolved epoch defaults to LoKr tab too for debug/default parity.

Line [38] computes epoch defaults from DEBUG_TRAINING, but Line [59] calls create_training_lokr_tab() without forwarding them, so LoKr epochs remain fixed independently.

💡 Suggested direction
         training_section.update(
             create_training_lora_tab(
                 epoch_min=epoch_min,
                 epoch_step=epoch_step,
                 epoch_default=epoch_default,
             )
         )
-        training_section.update(create_training_lokr_tab())
+        training_section.update(
+            create_training_lokr_tab(
+                epoch_min=epoch_min,
+                epoch_step=epoch_step,
+                epoch_default=epoch_default,
+            )
+        )

Also mirror that signature in:

  • acestep/ui/gradio/interfaces/training_lokr_tab.py
  • acestep/ui/gradio/interfaces/training_lokr_tab_run_export.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acestep/ui/gradio/interfaces/training.py` around lines 38 - 59, The epoch
defaults resolved by _resolve_epoch_slider_defaults() are not passed to
create_training_lokr_tab(), so LoKr uses hardcoded epochs; update the
create_training_lokr_tab() invocation to forward epoch_min, epoch_step, and
epoch_default, then extend the create_training_lokr_tab signature in
training_lokr_tab.py and the run/export section builder in
training_lokr_tab_run_export.py to accept (epoch_min, epoch_step, epoch_default)
and use those values for the LoKr epoch slider (and any defaults/exports)
instead of the fixed constants so debug/default parity is maintained.
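The suggested direction can be sketched in isolation as follows. The numeric defaults, the component keys, and the dictionary stand-ins for sliders are hypothetical; the point is only that both tab builders receive the same resolved values.

```python
from typing import Dict, Tuple

SliderSpec = Dict[str, int]


def resolve_epoch_slider_defaults(debug: bool) -> Tuple[int, int, int]:
    """Return (epoch_min, epoch_step, epoch_default); the numbers are placeholders."""
    return (1, 1, 1) if debug else (100, 100, 1000)


def create_training_lora_tab(
    epoch_min: int, epoch_step: int, epoch_default: int
) -> Dict[str, SliderSpec]:
    """Stand-in LoRA tab builder that records the slider settings it received."""
    return {"lora_epochs": {"minimum": epoch_min, "step": epoch_step, "value": epoch_default}}


def create_training_lokr_tab(
    epoch_min: int, epoch_step: int, epoch_default: int
) -> Dict[str, SliderSpec]:
    """Stand-in LoKr tab builder with the same signature as the LoRA one."""
    return {"lokr_epochs": {"minimum": epoch_min, "step": epoch_step, "value": epoch_default}}


def create_training_section(debug: bool) -> Dict[str, SliderSpec]:
    """Resolve the epoch defaults once and forward them to both tabs."""
    epoch_min, epoch_step, epoch_default = resolve_epoch_slider_defaults(debug)
    shared = dict(epoch_min=epoch_min, epoch_step=epoch_step, epoch_default=epoch_default)
    section: Dict[str, SliderSpec] = {}
    section.update(create_training_lora_tab(**shared))
    section.update(create_training_lokr_tab(**shared))
    return section
```

Resolving the values in one place means a debug run shortens both training modes together instead of only LoRA.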
acestep/ui/gradio/interfaces/training_dataset_tab_label_preview.py (1)

10-11: Expand the public function docstring to include the return contract.

Please document what keys/components are returned so the facade/wiring contract is self-describing in-code.

As per coding guidelines, "Docstrings are mandatory for all new or modified Python modules, classes, and functions. Docstrings must be concise and include purpose plus key inputs/outputs and raised exceptions when relevant."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acestep/ui/gradio/interfaces/training_dataset_tab_label_preview.py` around
lines 10 - 11, Update the docstring for build_dataset_label_and_preview_controls
to include the return contract: list the dict keys returned (e.g., keys for
auto-label controls, sample-preview components, any Gradio Blocks/Elements) and
describe the type/value for each key and their role in the wiring/facade (for
example "auto_label_controls: Gradio Block containing inputs/buttons",
"preview_panel: Gradio Component showing sample preview", etc.), and mention any
exceptions or error conditions; keep it concise and follow existing module
docstring style.
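A docstring that states the return contract might look like the sketch below. The four keys are the component names mentioned elsewhere in this review; the function's full return dictionary is not shown here, so the list is likely incomplete, and the `undocumented_keys` helper is a hypothetical addition.

```python
from typing import Any, Callable, Dict, List


def build_dataset_label_and_preview_controls() -> Dict[str, Any]:
    """Build the label and preview controls for the dataset tab.

    Returns:
        Dict mapping wiring keys to components:
            edit_caption: textbox holding the caption of the selected sample.
            edit_genre: textbox holding the genre tags.
            prompt_override: dropdown choosing the per-sample prompt source.
            edit_lyrics: textbox holding the lyrics.
    """
    # Placeholders instead of real Gradio components keep the sketch dependency-free.
    return {
        "edit_caption": None,
        "edit_genre": None,
        "prompt_override": None,
        "edit_lyrics": None,
    }


def undocumented_keys(builder: Callable[[], Dict[str, Any]]) -> List[str]:
    """Return the keys a builder returns but does not name in its docstring."""
    doc = builder.__doc__ or ""
    return [key for key in builder() if key not in doc]
```

A check like `undocumented_keys` is one way to keep such a docstring from falling out of date as keys are added.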
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@acestep/ui/gradio/i18n/he.json`:
- Around line 447-475: Update the Hebrew locale file to translate the remaining
English LoKr strings: replace values for keys train_section_tensors,
train_section_lora, train_section_params, lokr_section_tensors,
lokr_section_settings, lokr_linear_dim, lokr_linear_dim_info, lokr_linear_alpha,
lokr_linear_alpha_info, lokr_factor, lokr_factor_info, lokr_decompose_both,
lokr_decompose_both_info, lokr_use_tucker, lokr_use_tucker_info,
lokr_use_scalar, lokr_use_scalar_info, lokr_weight_decompose,
lokr_weight_decompose_info, lokr_learning_rate_info, lokr_checkpoint_epoch, and
lokr_checkpoint_epoch_info with appropriate Hebrew translations consistent with
existing style (keep code/backtick examples like `.pt` as-is and preserve
emoji/button text where present).

In `@acestep/ui/gradio/i18n/ja.json`:
- Around line 434-456: Several LoKr locale entries are still in English; update
the specified keys (lokr_linear_dim_info, lokr_linear_alpha_info,
lokr_factor_info, lokr_decompose_both_info, lokr_use_tucker_info,
lokr_use_scalar_info, lokr_weight_decompose_info, lokr_learning_rate_info,
lokr_checkpoint_epoch, lokr_checkpoint_epoch_info) to proper Japanese
translations consistent with surrounding entries (e.g., explain rank/dimension,
scaling factor, Kronecker factor, decomposing both sides, Tucker decomposition,
scalar gating, weight decomposition, learning rate guidance, checkpoint epoch
label and selection guidance) and replace the English strings in the ja.json
file accordingly.

In `@acestep/ui/gradio/i18n/zh.json`:
- Around line 434-456: Several LoKr localization strings in zh.json are still in
English; update the listed keys (lokr_linear_dim_info, lokr_linear_alpha_info,
lokr_factor_info, lokr_decompose_both_info, lokr_use_tucker_info,
lokr_use_scalar_info, lokr_weight_decompose_info, lokr_learning_rate_info,
lokr_checkpoint_epoch, lokr_checkpoint_epoch_info) with proper Chinese
translations consistent with the surrounding entries, preserving meaning (e.g.,
describe rank/dimension, scaling factor, Kronecker factor, decomposing both
sides, Tucker decomposition, scalar gating, weight decomposition, learning rate
guidance, and checkpoint selection) and keep any UI emojis or punctuation style
consistent with other zh.json entries.

In `@acestep/ui/gradio/interfaces/training_dataset_tab_label_preview.py`:
- Around line 70-95: The hardcoded English UI strings in the training dataset
preview need localization: update the Textbox and Dropdown initializers
(edit_caption, edit_genre, prompt_override, edit_lyrics) to use the translation
function t(...) for placeholders, the prompt_override choices/values, and any
labels currently using literals; specifically replace "Music description...",
"pop, electronic, dance...", the Dropdown choices/value ("Use Global Ratio",
"Caption", "Genre"), and the lyrics placeholder "[Verse 1]\nLyrics
here...\n\n[Chorus]\n..." with appropriate t(...) keys (e.g.,
t("training.caption_placeholder"), t("training.genre_placeholder"),
t("training.prompt_override_use_global"), t("training.prompt_override_caption"),
t("training.prompt_override_genre"), t("training.lyrics_placeholder")) so all
strings are localized.

In `@acestep/ui/gradio/interfaces/training_dataset_tab_save_preprocess.py`:
- Around line 64-70: The Dropdown instantiation for preprocess_mode hardcodes
English strings; wrap the label, choices and info text in the translation
function (t) so they render localized UI copy. Update the gr.Dropdown call for
preprocess_mode to use t("...") for label ("Preprocess For"), each choice
("LoRA", "LoKr") via translated strings or a mapped list, and the info string
("LoRA keeps compatibility mode; LoKr uses per-sample source-style context.") so
all displayed text uses t(...) instead of literal English.

In `@acestep/ui/gradio/interfaces/training_dataset_tab_scan_settings.py`:
- Around line 13-40: The UI contains hardcoded English in the quick-start
paragraph and two section headers; replace those literal strings with
translation lookups (use t("training.quick_start_message") for the paragraph and
t("training.load_section_title") and t("training.scan_section_title") for the
headings) inside the existing gr.HTML and gr.HTML("<h4>...") calls (referenced
as the first gr.HTML block and the two gr.HTML header calls) and add
corresponding keys to the locale resource files so all locales render the same
copy.
- Around line 59-95: The Dataframe headers and the LM checkbox label/info are
hardcoded English; update audio_files_table to use translated strings via the
t(...) function for each header (e.g., replace "#", "Filename", "Duration",
"Lyrics", "Labeled", "BPM", "Key", "Caption" with t(...) keys you add like
training.table_index, training.table_filename, etc.) and replace the static
label/info for format_lyrics and transcribe_lyrics with t(...) keys (e.g.,
training.format_lyrics_label, training.format_lyrics_info,
training.transcribe_lyrics_label, training.transcribe_lyrics_info) so all
user-facing text in audio_files_table, format_lyrics, and transcribe_lyrics is
localized.

In `@acestep/ui/gradio/interfaces/training_lora_tab_run_export.py`:
- Around line 91-95: The UI strings for the Textbox named resume_checkpoint_dir
are not localized; update the Textbox initialization so its label and info use
the localization function (e.g., t("...")) instead of hardcoded English text —
replace label="Resume Checkpoint" with label=t("...") and info="Directory of a
saved LoRA checkpoint to resume from" with info=t("...") in the
resume_checkpoint_dir declaration inside training_lora_tab_run_export.py to
ensure locale consistency.

In `@acestep/ui/gradio/interfaces/training.py`:
- Around line 40-47: The HTML header currently hardcodes English inside the
gr.HTML(...) call; replace the static English strings with localized text by
calling the localization function (e.g., t("...")) for both the title and
subtitle and injecting those localized values into the HTML template used in
gr.HTML. Locate the gr.HTML(...) block in training.py and change the embedded
"<h2>🎵 LoRA Training for ACE-Step</h2>" and the <p> subtitle to use t("...")
results (for example t("training.header.title") and
t("training.header.subtitle")) so the rendered UI uses translated text.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9926c490-1752-4a0a-8f77-760692b77326

📥 Commits

Reviewing files that changed from the base of the PR and between 2507a93 and 85b9983.

📒 Files selected for processing (17)
  • acestep/ui/gradio/i18n/en.json
  • acestep/ui/gradio/i18n/he.json
  • acestep/ui/gradio/i18n/ja.json
  • acestep/ui/gradio/i18n/zh.json
  • acestep/ui/gradio/interfaces/training.py
  • acestep/ui/gradio/interfaces/training_contract_ast_utils.py
  • acestep/ui/gradio/interfaces/training_dataset_builder_tab.py
  • acestep/ui/gradio/interfaces/training_dataset_tab_label_preview.py
  • acestep/ui/gradio/interfaces/training_dataset_tab_save_preprocess.py
  • acestep/ui/gradio/interfaces/training_dataset_tab_scan_settings.py
  • acestep/ui/gradio/interfaces/training_decomposition_contract_test.py
  • acestep/ui/gradio/interfaces/training_lokr_tab.py
  • acestep/ui/gradio/interfaces/training_lokr_tab_dataset.py
  • acestep/ui/gradio/interfaces/training_lokr_tab_run_export.py
  • acestep/ui/gradio/interfaces/training_lora_tab.py
  • acestep/ui/gradio/interfaces/training_lora_tab_dataset.py
  • acestep/ui/gradio/interfaces/training_lora_tab_run_export.py

@ChuxiJ ChuxiJ merged commit 4da3a49 into ace-step:main Mar 6, 2026
1 check passed

2 participants